Learning Networks from Random Walk-Based Node Similarities
Authors
Affiliations (in listed order): Yale University; Massachusetts Institute of Technology; Massachusetts Institute of Technology; Boston University & Harvard University.
Abstract
Digital presence in the world of online social media entails significant privacy risks [6, 10, 33, 58, 34]. In this work we consider a privacy threat to a social network in which an attacker has access to a subset of random walk-based node similarities, such as effective resistances (i.e., commute times) or personalized PageRank scores. Using these similarities, the attacker’s goal is to infer as much information as possible about the underlying network, including any remaining unknown pairwise node similarities and edges. For the effective resistance metric, we show that with just a small subset of measurements, the attacker can learn a large fraction of edges in a social network (and in some cases all edges), even when the measurements are noisy. We also show that it is possible to learn a graph which accurately matches the underlying network on all other effective resistances. This second observation is interesting from a data mining perspective, since it can be expensive to accurately compute all effective resistances or other random walk-based similarities. As an alternative, our graphs learned from just a subset of approximate effective resistances can be used as surrogates in a wide range of applications that use effective resistances to probe graph structure, including graph clustering, node centrality evaluation, and anomaly detection. We obtain our results by formalizing the graph learning objective mathematically, using two optimization problems. One formulation is convex and is provably solvable in polynomial time. The other is not, but we solve it efficiently with projected gradient and coordinate descent. We demonstrate the effectiveness of these methods on a number of social networks obtained from Facebook. We also discuss how our methods can be generalized to other random walk-based similarities, such as personalized PageRank scores. Our code is available at https://github.com/cnmusco/graph-similarity-learning.
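To make the central quantity concrete, the following is a minimal sketch of how all pairwise effective resistances of a small, fully known graph could be computed from the pseudoinverse of its Laplacian. It is not taken from the authors' repository; the function name `effective_resistances` and the toy path graph are hypothetical, and the graph is assumed to be undirected and connected.

```python
import numpy as np


def effective_resistances(adjacency):
    """Return R with R[u, v] = effective resistance between nodes u and v.

    Assumes `adjacency` is a symmetric, non-negative weight matrix of a
    connected undirected graph (an assumption of this sketch).
    """
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A      # L = D - A
    l_pinv = np.linalg.pinv(laplacian)          # Moore-Penrose pseudoinverse L^+
    d = np.diag(l_pinv)
    # R[u, v] = (e_u - e_v)^T L^+ (e_u - e_v)
    #         = L^+[u, u] + L^+[v, v] - 2 * L^+[u, v]
    return d[:, None] + d[None, :] - 2.0 * l_pinv


# Toy example: unweighted path graph 0 - 1 - 2.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
R = effective_resistances(path)
print(np.round(R, 6))
```

On the path graph, adjacent nodes are at resistance 1 and the two endpoints at resistance 2, as for unit resistors in series. The dense pseudoinverse costs roughly cubic time in the number of nodes, which illustrates why computing all effective resistances exactly can be expensive on large networks.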
Related resources
Supervised Q-walk for Learning Vector Representation of Nodes in Networks
Automatic feature learning algorithms are at the forefront of modern-day machine learning research. We present a novel algorithm, supervised Q-walk, which applies Q-learning to generate random walks on graphs such that the walks prove useful for learning node features suited to the node classification problem. We present another novel algorithm, k-hops neighborhood based ...
Coinciding Walk Kernels: Parallel Absorbing Random Walks for Learning with Graphs and Few Labels
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cwk), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arra...
Learning with Graphs using Kernels from Propagated Information
Traditional machine learning approaches are designed to learn from independent vector-valued data points. The assumption that instances are independent, however, is not always true. On the contrary, there are numerous domains where data points are cross-linked, for example social networks, where persons are linked by friendship relations. These relations among data points make traditional machine l...
Sampling-based algorithm for link prediction in temporal networks
The problem of link prediction in temporal networks has attracted considerable recent attention from various domains, such as sociology, anthropology, information science, and computer science. In this paper, we propose a fast similarity-based method to predict the potential links in temporal networks. In this method, we first combine the snapshots of the temporal network into a weighted graph....
A Novel Learning-based Search Algorithm for Unstructured Peer to Peer Networks
File sharing is a popular application of unstructured peer-to-peer networks, and finding particular data held at the nodes requires an appropriate search method. In this paper, we propose a new version of the k-random walk algorithm using learning automata. In the proposed method, the value of k for k-random walk is not selected randomly but is selected in an adaptive manner...
Journal: CoRR
Volume: abs/1801.07386
Publication date: 2018